Prevent error loading pdf: Added pymupdf/fitz open to get page count #225
When adding some PDFs to the `Docs` class, I often received an error (in case it matters, they were docs returned by `zotero.iterate`). Further inspection revealed that pypdf was having trouble parsing the PDF: `pypdf.PdfReader` returned an object that couldn't be evaluated with the `len` function, which the `utils.count_pdf_pages` function requires.

Although paper-qa reads PDF text with fitz by default, the `utils.count_pdf_pages` function is only written for pypdf. Therefore, my pull request simply modifies it to use fitz there as well.
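
For illustration, a minimal sketch of what a fitz-based page count could look like. This is not the actual diff from this pull request: the function body and the `open_pdf` parameter are assumptions, and the latter exists only so the opener can be swapped out (for example in tests). It relies on PyMuPDF's `fitz.open` and the document's `page_count` attribute.

```python
def count_pdf_pages(file_path, open_pdf=None):
    """Return the number of pages in a PDF, using PyMuPDF (fitz) by default.

    `open_pdf` is a hypothetical hook, not part of paper-qa: any callable that
    takes a path and returns a context manager exposing `page_count`.
    """
    if open_pdf is None:
        import fitz  # PyMuPDF; imported lazily so the hook works without it

        open_pdf = fitz.open
    # fitz documents are context managers, so the file is closed on exit.
    with open_pdf(file_path) as doc:
        return doc.page_count
```

Calling `count_pdf_pages("paper.pdf")` with PyMuPDF installed opens the file with the same library that paper-qa uses to read the text, so a PDF that fitz can parse is not rejected at the counting step.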